This work is the second stage in a series of studies on optical character recognition (OCR) for historical Arabic documents, and it investigates how different modeling choices interact with the problem. The first study examined the impact of transformers on our custom Arabic dataset. One drawback of that first study was the size of the training data: owing to limited resources, only 15,000 images out of our 30 million were used. In addition, we add an image-enhancement layer, time and space optimizations, and a post-correction layer to help the model predict the correct context. Notably, we propose an end-to-end text recognition approach that uses a vision transformer (BEiT) as the encoder and a vanilla transformer as the decoder, eliminating CNNs for feature extraction and reducing model complexity. Experiments show that our end-to-end model outperforms convolutional backbones, achieving a CER of 4.46%.
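As a rough illustration of the encoder-decoder pairing described above, the sketch below wires a BEiT image encoder to a plain transformer decoder using the Hugging Face `VisionEncoderDecoderModel` wrapper. The checkpoint names, the multilingual tokenizer standing in for an Arabic one, and the greedy decoding loop are assumptions for illustration, not the authors' released configuration.

```python
# Minimal sketch: BEiT vision encoder + transformer decoder for line-level text
# recognition. Checkpoint names and tokenizer are placeholders, not the paper's release.
from transformers import VisionEncoderDecoderModel, AutoTokenizer, AutoImageProcessor

# Pair a pretrained BEiT encoder with a decoder that cross-attends to its outputs.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/beit-base-patch16-224-pt22k-ft22k",  # vision transformer encoder
    "bert-base-multilingual-cased",                 # stands in for a vanilla transformer decoder
)
processor = AutoImageProcessor.from_pretrained("microsoft/beit-base-patch16-224-pt22k-ft22k")
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

def recognize(line_image):
    """Greedy decoding of one line image (PIL.Image) into text."""
    pixel_values = processor(images=line_image, return_tensors="pt").pixel_values
    output_ids = model.generate(pixel_values, max_length=128)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

In practice such a model would be fine-tuned on line-image/transcript pairs and validated with the character error rate (CER) quoted in the abstract.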
Each grid block in a 3D geological model requires a rock type that represents all the physical and chemical properties of that block. The properties used to classify rock types are lithology, permeability, and capillary pressure. Scientists and engineers have traditionally determined these properties with laboratory measurements that embed destructive methods into the sample or alter some of its properties (i.e., wettability, permeability, and porosity), because the measurement process involves sample crushing, fluid flow, or fluid saturation. More recently, Digital Rock Physics (DRP) has emerged to quantify these properties from micro-computed tomography (uCT) and magnetic resonance imaging (MRI) images. However, the literature has not attempted rock typing in a fully digital context. We propose performing Digital Rock Typing (DRT) by: (1) integrating the latest DRP advances in a novel process that honors the digital determination of rock properties; (2) digitalizing the latest rock typing approaches for carbonates; and (3) introducing a novel carbonate rock typing process that exploits computer vision capabilities to provide more insight into heterogeneous carbonate rock texture.
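For context on what a digital workflow like this would aim to reproduce, a widely used conventional rock typing index is the Flow Zone Indicator (FZI), computed from porosity and permeability. The sketch below is standard petrophysics (Amaefule-style FZI), not the digital rock typing process the abstract proposes.

```python
import numpy as np

def flow_zone_indicator(porosity, permeability_md):
    """Classic FZI rock-typing index: samples with similar pore-throat geometry
    share similar FZI. porosity is a fraction, permeability is in millidarcy."""
    rqi = 0.0314 * np.sqrt(permeability_md / porosity)  # Reservoir Quality Index, microns
    phi_z = porosity / (1.0 - porosity)                  # normalized porosity
    return rqi / phi_z

# Two samples with the same porosity but different permeability fall into
# different rock types.
print(flow_zone_indicator(np.array([0.20, 0.20]), np.array([50.0, 500.0])))
```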
Permeability has a dominant influence on the flow properties of natural fluids. Lattice Boltzmann simulators determine the permeability of nano- and micro-pore networks, but such simulations involve millions of flow-dynamics calculations, with accumulated error and a high cost in computing power. To predict permeability efficiently and consistently, we propose a morphology decoder: a machine-learning-based parallel and serial flow reconstruction from 3D micro-computed tomography and nuclear magnetic resonance images. For 3D vision, we introduce the controllable-measurable-volume as a new supervised segmentation, in which a unique set of voxel intensities corresponds to grain and pore-throat sizes. The morphology decoder demarcates and aggregates morphological boundaries in a novel way to produce permeability. The morphology decoder method consists of five novel processes, described in this paper: (1) geometrical 3D permeability, (2) machine-learning-guided 3D property recognition of rock morphology, (3) a 3D image property integration model for permeability, (4) an MRI permeability imager, and (5) the morphology decoder (the process that integrates the other four).
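As a point of reference for the "geometrical 3D permeability" idea, the sketch below computes a classic Kozeny-Carman permeability estimate from a segmented pore/grain volume. This is a textbook geometric approximation under an assumed grain diameter, not the authors' morphology decoder.

```python
import numpy as np

def kozeny_carman_permeability(segmented_volume, grain_diameter_um):
    """Classic Kozeny-Carman permeability estimate from a segmented 3D image.
    segmented_volume: 3D array with 1 marking pore voxels and 0 marking grains.
    Returns permeability in square micrometres (1 um^2 ~ 1.01 darcy ~ 1013 mD)."""
    porosity = segmented_volume.mean()  # pore voxels / total voxels
    d = grain_diameter_um
    return (porosity ** 3) * d ** 2 / (180.0 * (1.0 - porosity) ** 2)

# Toy example: a 64^3 volume with ~25% porosity and 50 um grains.
rng = np.random.default_rng(0)
volume = (rng.random((64, 64, 64)) < 0.25).astype(np.uint8)
print(kozeny_carman_permeability(volume, grain_diameter_um=50.0))
```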
Deciding which machine learning algorithm is best is not easy. To help future researchers, we describe in this paper the state of the art among the best algorithms. We built a synthetic dataset and performed supervised machine learning with five different algorithms. For heterogeneity, we determined that Random Forest, among others, is the best algorithm.
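A minimal sketch of this kind of algorithm comparison on a synthetic dataset is shown below; the five scikit-learn estimators are illustrative choices and not necessarily the exact five the study evaluated.

```python
# Compare several supervised classifiers on a synthetic dataset with cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=20, n_informative=8, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "svm": SVC(),
    "knn": KNeighborsClassifier(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```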
Automated image processing algorithms can improve the quality, efficiency, and consistency of classifying the morphology of heterogeneous carbonate rock, and can handle massive volumes of data and images seamlessly. Geologists face difficulty in choosing the best method for determining petrophysical properties from rock images, micro-computed tomography (uCT), or magnetic resonance imaging (MRI). Most successful work comes from homogeneous rocks, focuses on 2D images with less attention to 3D, and requires numerical simulation. Currently, image analysis methods converge on three approaches: image processing, artificial intelligence, and image processing combined with artificial intelligence. In this work, we propose two methods for determining porosity from 3D uCT and MRI images: an image processing method with an Image Resolution Optimized Gaussian Algorithm (IROGA), and an advanced image recognition method enabled by a Machine Learning Difference of Gaussians Random Forest (MLDGRF). We built reference 3D micro models and collected images to calibrate the IROGA and MLDGRF methods. To evaluate the predictive capability of these calibrated methods, we ran them on 3D uCT and MRI images of natural heterogeneous carbonate rock. We measured the porosity and lithology of the carbonate rock with industry-standard methods to serve as reference values. Notably, compared with the three experimental measurements, IROGA and MLDGRF achieved accuracies of 96.2% and 97.1%, and 91.7% and 94.4%, respectively. We measured the limestone and pyrite reference values using two methods, X-ray powder diffraction and grain density measurements. MLDGRF produced lithology (limestone and pyrite) volumes with 97.7% accuracy.
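The snippet below shows only the basic ingredients such pipelines build on, namely Gaussian filtering, thresholding, and voxel counting, applied to a 3D uCT intensity volume. It is a generic baseline, not the calibrated IROGA or MLDGRF method.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.filters import threshold_otsu

def porosity_from_uct(volume, sigma=1.0):
    """Rough porosity estimate from a 3D uCT intensity volume:
    Gaussian smoothing, Otsu thresholding, then counting pore voxels.
    Assumes pores are the darker (lower-intensity) phase."""
    smoothed = gaussian_filter(volume.astype(float), sigma=sigma)
    threshold = threshold_otsu(smoothed)
    pore_mask = smoothed < threshold
    return pore_mask.mean()  # fraction of voxels classified as pore space
```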
Retrieval-augmented in-context learning has emerged as a powerful approach for addressing knowledge-intensive tasks using frozen language models (LM) and retrieval models (RM). Existing work has combined these in simple "retrieve-then-read" pipelines in which the RM retrieves passages that are inserted into the LM prompt. To begin to fully realize the potential of frozen LMs and RMs, we propose Demonstrate-Search-Predict (DSP), a framework that relies on passing natural language texts in sophisticated pipelines between an LM and an RM. DSP can express high-level programs that bootstrap pipeline-aware demonstrations, search for relevant passages, and generate grounded predictions, systematically breaking down problems into small transformations that the LM and RM can handle more reliably. We have written novel DSP programs for answering questions in open-domain, multi-hop, and conversational settings, establishing in early evaluations new state-of-the-art in-context learning results and delivering 37-200%, 8-40%, and 80-290% relative gains against vanilla LMs, a standard retrieve-then-read pipeline, and a contemporaneous self-ask pipeline, respectively.
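A schematic of the Demonstrate-Search-Predict control flow is sketched below. The `lm` and `rm` callables are hypothetical stand-ins for a frozen language model and retrieval model; the real DSP library exposes a richer program abstraction than this simplified loop.

```python
# Schematic DSP-style pipeline, not the DSP library's real API.
def dsp_answer(question, train_examples, lm, rm, hops=2, k=5):
    # Demonstrate: in the real framework, demonstrations are bootstrapped by
    # running the pipeline on a few training examples; here we simply pass
    # a handful of raw examples through.
    demos = train_examples[:3]

    # Search: multi-hop retrieval, where each hop's query is written by the LM
    # conditioned on the question and the passages gathered so far.
    passages, query = [], question
    for _ in range(hops):
        passages += rm(query, k=k)
        query = lm(f"Demos: {demos}\nContext: {passages}\n"
                   f"Write a follow-up search query for: {question}")

    # Predict: generate an answer grounded in the retrieved passages.
    return lm(f"Demos: {demos}\nContext: {passages}\nQuestion: {question}\nAnswer:")
```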
The field of autonomous mobile robots has undergone dramatic advancements over the past decades. Despite achieving important milestones, several challenges are yet to be addressed. Aggregating the achievements of the robotics community in survey papers is vital for keeping track of the current state of the art and the challenges that must be tackled in the future. This paper provides a comprehensive review of autonomous mobile robots covering topics such as sensor types, mobile robot platforms, simulation tools, path planning and following, sensor fusion methods, obstacle avoidance, and SLAM. The motivation for this survey is twofold. First, the field of autonomous navigation evolves fast, so writing survey papers regularly is crucial to keep the research community well aware of its current status. Second, deep learning methods have revolutionized many fields, including autonomous navigation; it is therefore necessary to give appropriate treatment to the role of deep learning in autonomous navigation, which this paper also covers. Future work and research gaps are also discussed.
We introduce a machine-learning (ML)-based weather simulator--called "GraphCast"--which outperforms the most accurate deterministic operational medium-range weather forecasting system in the world, as well as all previous ML baselines. GraphCast is an autoregressive model, based on graph neural networks and a novel high-resolution multi-scale mesh representation, which we trained on historical weather data from the European Centre for Medium-Range Weather Forecasts (ECMWF)'s ERA5 reanalysis archive. It can make 10-day forecasts, at 6-hour time intervals, of five surface variables and six atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree latitude-longitude grid, which corresponds to roughly 25 x 25 kilometer resolution at the equator. Our results show GraphCast is more accurate than ECMWF's deterministic operational forecasting system, HRES, on 90.0% of the 2760 variable and lead time combinations we evaluated. GraphCast also outperforms the most accurate previous ML-based weather forecasting model on 99.2% of the 252 targets it reported. GraphCast can generate a 10-day forecast (35 gigabytes of data) in under 60 seconds on Cloud TPU v4 hardware. Unlike traditional forecasting methods, ML-based forecasting scales well with data: by training on bigger, higher quality, and more recent data, the skill of the forecasts can improve. Together these results represent a key step forward in complementing and improving weather modeling with ML, open new opportunities for fast, accurate forecasting, and help realize the promise of ML-based simulation in the physical sciences.
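The forecasting procedure described above is autoregressive: a 10-day forecast at 6-hour steps is 40 applications of a learned one-step model, each prediction feeding the next. The sketch below illustrates only that rollout pattern; `model` is a placeholder for a trained one-step predictor, not GraphCast itself.

```python
# Schematic autoregressive rollout of a learned one-step weather model.
def rollout(model, initial_state, steps=40):
    """10-day forecast at 6-hour intervals = 40 model steps."""
    states = [initial_state]
    for _ in range(steps):
        # Each prediction is fed back as the input to the next step, so errors
        # can compound over the forecast horizon.
        states.append(model(states[-1]))
    return states[1:]  # the 40 forecast states, 6 h apart
```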
Low Earth Orbit (LEO) constellations, each comprising a large number of satellites, have become a new source of big data "from the sky". Downloading such data to a ground station (GS) for big data analytics demands very high bandwidth and involves large propagation delays. Federated Learning (FL) offers a promising solution because it allows data to stay in-situ (never leaving satellites) and it only needs to transmit machine learning model parameters (trained on the satellites' data). However, the conventional, synchronous FL process can take several days to train a single FL model in the context of satellite communication (Satcom), due to a bottleneck caused by straggler satellites. In this paper, we propose an asynchronous FL framework for LEO constellations called AsyncFLEO to improve FL efficiency in Satcom. Not only does AsyncFLEO address the bottleneck (idle waiting) in synchronous FL, but it also solves the issue of model staleness caused by straggler satellites. AsyncFLEO utilizes high-altitude platforms (HAPs) positioned "in the sky" as parameter servers, and consists of three technical components: (1) a ring-of-stars communication topology, (2) a model propagation algorithm, and (3) a model aggregation algorithm with satellite grouping and staleness discounting. Our extensive evaluation with both IID and non-IID data shows that AsyncFLEO outperforms the state of the art by a large margin, cutting down convergence delay by 22 times and increasing accuracy by 40%.
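The core idea behind staleness discounting can be illustrated with a tiny sketch: an asynchronous server mixes each incoming satellite update into the global model with a weight that shrinks as the update grows stale. The discount function and grouping logic below are simplified assumptions, not AsyncFLEO's exact algorithm.

```python
import numpy as np

def async_update(global_weights, client_weights, staleness, base_lr=0.5):
    """Merge one (possibly stale) satellite update into the global model.
    staleness = number of global model versions elapsed since the satellite
    downloaded the model; older updates get a smaller mixing weight."""
    alpha = base_lr / (1.0 + staleness)
    return (1.0 - alpha) * global_weights + alpha * client_weights

w_global = np.zeros(10)
w_fresh = np.ones(10)
print(async_update(w_global, w_fresh, staleness=0))  # fresh update: large influence
print(async_update(w_global, w_fresh, staleness=8))  # straggler: heavily discounted
```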
A complete computer vision system can be divided into two main categories: detection and classification. The lane detection algorithm belongs to the detection category and has been applied in autonomous driving and smart vehicle systems. The lane detection system is responsible for lane marking in a complex road environment. At the same time, lane detection plays a crucial role in warning the driver when the car departs its lane. The implemented lane detection algorithm is mainly divided into two steps: edge detection and line detection. In this paper, we compare the state-of-the-art implementation performance obtained with both FPGA and GPU to evaluate the trade-offs in latency, power consumption, and resource utilization. Our comparison emphasises the advantages and disadvantages of the two systems.
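A minimal software reference for the two-step pipeline (edge detection, then line detection) is shown below using OpenCV's Canny and probabilistic Hough transform; the thresholds are illustrative values, not the tuned parameters of the FPGA or GPU implementations under comparison.

```python
import cv2
import numpy as np

def detect_lanes(bgr_frame):
    """Toy lane detector: Canny edges followed by probabilistic Hough lines."""
    gray = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)  # step 1: edge detection
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                            minLineLength=40, maxLineGap=20)  # step 2: line detection
    return lines if lines is not None else []
```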